What's Across the Border? Re-Evaluating the Cross-Border Evidence on Minimum Wage Effects
Priyaranjan Jha, David Neumark and Antonio Rodriguez-Lopez
Journal of Political Economy Microeconomics, forthcoming
February 2024

Tables and Figures:
To obtain all the tables and figures in the paper, run estimate-master.do in the "do/estimate" folder. Make sure to set your directory path in files "do/estimate/estimate-master.do" and "do/estimate/directories.do". Every table or figure file can also be run independently (just make sure your working directory is "do/estimate"). All output tables are in file "out/TablesJNR.xlsx" and all figures are in folder "out/figures".

Note: The Stata packages reghdfe, xtevent, maptile, and estout must be installed for "estimate-master.do" to run. For Table 1, please install DLR's file "cluster2areg.ado", which is located in the "JNR_replication_files/ado" folder. We also include the file "do/estimate/table1_reghdfe.do", which does not require "cluster2areg.ado" and instead reports Table 1 using reghdfe and Cameron-Gelbach-Miller multiway clustered standard errors.
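
A minimal sketch of the setup described in the note above. It assumes that all four packages are available from SSC under the names given here, that Stata has internet access, and that the working directory is the parent of the "JNR_replication_files" folder. Some packages may need dependencies of their own (for example, ftools for reghdfe and spmap for maptile).

```stata
* Install each user-written package from SSC if it is not already present.
* Assumption: all four packages are distributed through SSC under these names.
foreach pkg in reghdfe xtevent maptile estout {
    capture which `pkg'
    if _rc ssc install `pkg', replace
}

* Make cluster2areg.ado visible to Stata for the current session.
* Assumption: the path is relative to the current working directory.
adopath + "JNR_replication_files/ado"
```

The adopath change lasts only for the current Stata session. Copying "cluster2areg.ado" into your personal ado directory is an alternative that persists.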

Build files:
To build the samples used in this paper starting from raw County Business Patterns (CBP) data, run build-master.do in the "do/build" folder. Make sure to set your directory path in files "do/build/build-master.do" and "do/build/directories_build.do". Every build file can also be run independently (just make sure your working directory is "do/build"). Please read Readme_raw_folder_contents.txt in the "data/raw" folder for the internet location of raw CBP files.
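
The sequence from raw data to tables and figures can be sketched as follows. The global "root" and its path are hypothetical placeholders, not part of the package. The directory paths inside the master and directories files must still be set as described above.

```stata
* Hypothetical root path: replace with the location of the replication package.
global root "C:/path/to/JNR_replication_files"

* Step 1: build the samples from the raw CBP data.
cd "$root/do/build"
do build-master.do

* Step 2: produce all tables and figures from the built samples.
cd "$root/do/estimate"
do estimate-master.do
```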

Description of files needed BEFORE running "do/build" folder do files:

1) CBP national and county raw files: These files must be placed in the "data/raw" folder. For each year from 1990 to 2016 there must be a national file (cbp`y'us.txt) and a county file (cbp`y'co.txt). These files are not included in the replication package because they are large, but they can be downloaded from: https://www.census.gov/programs-surveys/cbp/data/datasets.html

2) "xwalks" folder files: This folder comes directly from AADHP's replication package at https://www.ddorn.net/data/AADHP-GreatSag-FileArchive.zip. These files are needed to convert all CBP files to the SIC87DD 479-industry classification used by AADHP. Since AADHP use data up to 2011, we added two files to the "xwalks" folder to be able to use 2012-2016 data. The file "xwalks/ind/nesting/naics12_nesting.dta" was created from https://www2.census.gov/programs-surveys/cbp/technical-documentation/reference/naics-descriptions/naics2012.txt. The file "xwalks/ind/naics/naics12_naics07.dta" was constructed using the Census bridge file EC1200CBDG1.xlsx and do file "xwalks/ind/naics/bridge_naics_2012_2016.do".

3) "data/czone/cw_cty_czone_mod_AADHP.dta": AADHP file that maps 3143 counties into 741 commuting zones.

4) "data/czone/cty_czone_state_boundaries.dta": Updates AADHP file 3) to include a multi-state dummy variable that indicates if the commuting zone includes counties from more than one state, and a three-state dummy variable that indicates if it is a three-state commuting zone.

5) "data/czone/czone_multi.dta": Maps 741 commuting zones to multi-state czone indicators.

6) "data/cbp/cbp_sic87dd_sic87ss_sic87xx_desc.dta": File that maps the 479 AADHP industries (SIC87DD) into 20 industries (SIC87XX). Our industry of interest, Restaurants (SIC87DD 5812), has a 1-to-1 correspondence with the SIC87XX classification.

7) "data/pop/cty_pop_1990_2018.dta": County-level population data created from Census Population Estimates (https://www2.census.gov/programs-surveys/popest/datasets/) using AADHP's popest_reader.do file. It was updated up to 2018 by using Census's file cc-est2018-alldata.csv available at https://www2.census.gov/programs-surveys/popest/datasets/2010-2018/counties/asrh/.

8) "data/czone/cw_czone_region_AADHP.dta": AADHP's file that maps 741 commuting zones to 9 Census regional divisions.

9) "data/mw/VZ_state_annual.dta": Includes state and federal minimum wages for every state plus DC from 1974 to 2016. The minimum wage to use is max_mw (Annual state maximum). Comes from Vaghul and Zipperer (2016) and can be downloaded from https://github.com/equitablegrowth/VZ_historicalminwage/releases.

10) "data/dlr/cz_pairs_dlr.dta": Indicates all incomplete and complete czone-state pairs that can be constructed with DLR's QCEW data. There are 128 possible pairs, of which 55 are incomplete (pairtimes=1) and 73 are complete (pairtimes=2) -- see section 2.

11) "data/dlr/co_pairs_dlr.dta": Indicates all incomplete and complete county pairs in DLR's QCEW data. There are 754 pairs, of which 438 are incomplete (pairtimes=1) and 316 are complete (pairtimes=2) -- see section 2.

12) "data/dlr/dlrmscz_counties.dta": This file contains DLR's list of the 1,139 US counties that lie along a state border -- obtained from "county-pair-list.txt" in the DLR replication package available at https://dataverse.harvard.edu/api/access/datafile/:persistentId?persistentId=doi:10.7910/DVN/L4DUZ7/8ODET4 -- plus the 197 MSCZ counties that do not lie on a state border.

13) "data/dlr/dlrmscz_pairs_sic87xx_years.dta": This file was created from DLR's "county-pair-list.txt" and all potential non-contiguous county pairs to facilitate the creation of the sample for the county-pairs estimation. It contains the 1,181 DLR pairs plus 678 non-contiguous county pairs (each pair in same MSCZ), for each year, and for each of the SIC87XX 20 industries.
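
Before running the build files, it may help to confirm that the raw CBP files listed in 1) are all in place. The check below is a sketch only. It assumes that `y' in the file names is the two-digit year (for example, cbp90us.txt and cbp90co.txt for 1990) and that the working directory is the root of the replication package.

```stata
* Count missing raw CBP files for 1990-2016.
* Assumption: file names use the two-digit year, e.g. cbp90us.txt.
local missing = 0
forvalues year = 1990/2016 {
    local y = substr("`year'", 3, 2)
    foreach level in us co {
        capture confirm file "data/raw/cbp`y'`level'.txt"
        if _rc {
            display as error "Missing: data/raw/cbp`y'`level'.txt"
            local ++missing
        }
    }
}
display "`missing' raw CBP file(s) missing"
```

If the downloaded files follow a different naming pattern, adjust the file name in the confirm file line accordingly.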


Description of "do/build" files:

"1_cbp_us_reader_AADHP.do": AADHP file to read CBP national files and convert them to the SIC87DD 479-industry classification. Uses files in 2). Output file:
14) "data/cbp/cbp_national_1990_2016.dta"

"2_cbp_co_reader_AADHP_updated.do": AADHP file to read CBP county files and impute employment. Updated to include 2012-2016 data and to add imputed annual and quarterly payrolls. Uses files in 2) and 4). Generates county-level and czone-state level files. Output files:
15) "data/cbp/cbp_county_1990_2016.dta"
16) "data/cbp/cbp_czone_state_1990_2016.dta"

"3_build_wageranking_20ind.do": Uses 6) to collapse 15) to 20-industry (SIC87XX) classification, and then ranks these 20 industries according to nominal average earnings in 1990. Output files:
17) "data/cbp/cbp_national_20ind_1990_2016"
18) "data/cbp/cbp_sic87xx_wagerank.dta"

"4_builder_czone_state_pop.do": Creates population file at the czone-state level from county-level file. Uses 3) to collapse 7) to the czone-state level. Output file:
19) "data/pop/czone_state_pop_1990_2018.dta"

"5_builder_czone_state_20ind": Creates 20-industry czone-state level dataset. Uses 6) to collapse 16) to 20 industries, and then adds variables in 19) and 8). Output file:
20) "data/cbp/cbp_czone_state_wages_20ind_1990_2016.dta"

"6_sample_builder_czone_state_20ind": Creates czone-state sample by adding 9) (minimum wage data) and 18) (ranking data) to file 20), and calculating variables of interest. Output file:
21) "data/sample/cbp_czone_state_20ind_sample.dta"

"7_sample_builder_czonestate": Creates sample with czone-state pairs for 20 industries, as well as main czone-state pair sample for the restaurant industry. Uses 21), 18), and 10). Output files:
22) "data/sample/cbp_czonestatepairs_20ind_sample.dta"
23) "data/sample/cbp_stacked_czonestatepair_sample.dta" -- Restaurants sample

"8_builder_county_state_20ind": Creates 20-industry county level dataset. Uses 6) to collapse 15) to 20 industries, and then adds variables in 4), 7) and 8). Output file:
24) "data/cbp/cbp_county_state_wages_20ind_1990_2016.dta"

"9_sample_builder_county_state_20ind": Creates county sample by adding 9) and 18) to 24), and calculating variables of interest. Output file:
25) "data/sample/cbp_county_state_20ind_sample.dta"

"10_sample_builder_counties": Creates county pair sample with border pairs + non-contiguous county pairs for 20 industries, as well as the county pair sample for the restaurant industry. Uses 12), 13) and 25), and then 18) and 11). Output files:
26) "data/sample/cbp_countypairs_20ind_sample.dta"
27) "data/sample/cbp_stacked_countypair_sample.dta" -- Restaurants sample

"11_sample_builder_DLR": Using DLR's DATA_SETUP.DO from their replication package, this file builds the county-pair and czone-state pair samples using DLR's QCEW data. These samples are used for the replication exercise in section 2, Table 1. This file uses the following DLR files (not listed above): QCEW_industrydata_DLR.dta, county-pair-list_DLR.txt, MW_yr_qtr_84_07_DLR.dta. For the czone-state file we also use 4) and file cz_pairs_153_dlrbuild.dta (the latter contains the 153 possible czone-state pairs). Output files:
28) "data/dlr/QCEWindustry_minwage_contig_co.dta"
29) "data/dlr/QCEWindustry_minwage_contig_cz.dta"

Other auxiliary files:

30) "data/county/countytypes.dta": Classifies counties according to the type of pairs they form (e.g. border county pair, pair not in same czone, pair from same czone, etc). A county can be of different types.

31) "data/county/countypairtypes.dta": Classifies each county pair according to the relationship between its two counties (contiguous border pair, contiguous not in same czone, contiguous in same czone, etc).

32) "data/county/countypairtypes_years.dta": Structure of county pairs and years for the simulation of Figure 7.

33) "data/dlr/DLR_complete_co_pairs_period.dta": List of county pairs by year with DLR-QCEW data. Used for calculation of correlations in Table 8.

34) "data/dlr/DLR_complete_cz_pairs_period.dta": List of czone-state pairs by year with DLR-QCEW data. Used for calculation of correlations in Table 8.

35) "data/czone/cbp_county_pairs.dta": List of county pairs by year with CBP data. Used for calculation of correlations in Table 8.

36) "data/czone/cbp_czone_pairs.dta": List of czone-state pairs by year with CBP data. Used for calculation of correlations in Table 8.

37) "data/mw/VZ_state_annual_event": Event data for minimum wage changes. Constructed from 9). Used for Figure 5.

38) "data/sample/stacked_czonestatepair_lags.dta": Stacked czone-state pairs from 1980 to 1989. Appended to stacked sample and merged with 37). Used for Figure 5.

39) "data/hhi/hhi92_county_rest_dem.dta": 1992 HHI data for the restaurant industry by county, and demeaned HHI for each type of county pairs in Table C-1. HHI data was calculated from NETS data.

40) "data/pop/countypairs_years_pop.dta": Population for each county-year in each county pair. Used for Figure C-1.



